Overview

Dataset statistics

Number of variables: 9
Number of observations: 581
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 41.0 KiB
Average record size in memory: 72.2 B

Variable types

Numeric: 9

Alerts

Pregnancies is highly correlated with Age (High correlation)
Age is highly correlated with Pregnancies (High correlation)
SkinThickness is highly correlated with Insulin and 1 other field (High correlation)
Insulin is highly correlated with SkinThickness (High correlation)
BMI is highly correlated with SkinThickness (High correlation)
level_0 has unique values (Unique)
Age has 12 (2.1%) zeros (Zeros)

Reproduction

Analysis started: 2022-09-20 06:40:04.989387
Analysis finished: 2022-09-20 06:40:21.936652
Duration: 16.95 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

level_0
Real number (ℝ)

UNIQUE

Distinct: 581
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0
Minimum: -1.722624772
Maximum: 1.73484803
Zeros: 0
Zeros (%): 0.0%
Negative: 290
Negative (%): 49.9%
Memory size: 4.7 KiB

Quantile statistics

Minimum: -1.722624772
5-th percentile: -1.549523367
Q1: -0.8707836467
Median: 0.003833978942
Q3: 0.8465645036
95-th percentile: 1.570857225
Maximum: 1.73484803
Range: 3.457472801
Interquartile range (IQR): 1.71734815

Descriptive statistics

Standard deviation: 1.000861698
Coefficient of variation (CV): nan
Kurtosis: -1.190774234
Mean: 0
Median Absolute Deviation (MAD): 0.8609517252
Skewness: 0.01587279078
Sum: 1.465494393 × 10^-14
Variance: 1.001724138
Monotonicity: Strictly increasing
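
Every variable in this report shows a standard deviation of 1.000861698 and a variance of 1.001724138. The report does not say how the data were prepared, but one reading consistent with those figures is that each column was z-scored with the population standard deviation (dividing by n), after which the sample standard deviation (dividing by n - 1) was reported. With n = 581 observations this gives sqrt(581/580) ≈ 1.000861698. The sketch below only checks that arithmetic; the helper name `zscore_population` and the toy input are illustrative assumptions, not part of the report.

```python
import math
import statistics


def zscore_population(values):
    """Standardize values using the population standard deviation (divide by n)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


n = 581  # number of observations stated in the report
toy_column = list(range(n))  # hypothetical data; any non-constant column works
z = zscore_population(toy_column)

# The sample standard deviation (divide by n - 1) of population-z-scored data
# is sqrt(n / (n - 1)) regardless of the original values.
sample_sd = statistics.stdev(z)
expected_sd = math.sqrt(n / (n - 1))
expected_variance = n / (n - 1)
```

Under the same assumption, the very large or `nan` coefficients of variation follow from dividing this standard deviation by a mean that is zero or within floating-point noise of zero.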
Histogram with fixed size bins (bins=50)
Most frequent values
Value               Count  Frequency (%)
-1.722624772        1      0.2%
0.4183662911        1      0.2%
0.550469995         1      0.2%
0.5550252951        1      0.2%
0.5595805952        1      0.2%
0.5641358954        1      0.2%
0.5686911955        1      0.2%
0.5732464956        1      0.2%
0.5823570959        1      0.2%
0.586912396         1      0.2%
Other values (571)  571    98.3%

Smallest values
Value               Count  Frequency (%)
-1.722624772        1      0.2%
-1.718069472        1      0.2%
-1.708958871        1      0.2%
-1.699848271        1      0.2%
-1.695292971        1      0.2%
-1.690737671        1      0.2%
-1.686182371        1      0.2%
-1.68162707         1      0.2%
-1.67707177         1      0.2%
-1.66796117         1      0.2%

Largest values
Value               Count  Frequency (%)
1.73484803          1      0.2%
1.73029273          1      0.2%
1.725737429         1      0.2%
1.721182129         1      0.2%
1.712071529         1      0.2%
1.707516229         1      0.2%
1.702960929         1      0.2%
1.693850328         1      0.2%
1.689295028         1      0.2%
1.684739728         1      0.2%

Pregnancies
Real number (ℝ)

HIGH CORRELATION

Distinct: 17
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -3.668895365 × 10^-17
Minimum: -1.136042381
Maximum: 3.932020219
Zeros: 0
Zeros (%): 0.0%
Negative: 326
Negative (%): 56.1%
Memory size: 4.7 KiB

Quantile statistics

Minimum: -1.136042381
5-th percentile: -1.136042381
Q1: -0.8379210515
Median: -0.2416783927
Q3: 0.6526855955
95-th percentile: 1.845170913
Maximum: 3.932020219
Range: 5.0680626
Interquartile range (IQR): 1.490606647

Descriptive statistics

Standard deviation: 1.000861698
Coefficient of variation (CV): -2.727964682 × 10^16
Kurtosis: 0.3666248812
Mean: -3.668895365 × 10^-17
Median Absolute Deviation (MAD): 0.5962426588
Skewness: 0.9692620925
Sum: -2.131628207 × 10^-14
Variance: 1.001724138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Most frequent values
Value               Count  Frequency (%)
-0.8379210515       103    17.7%
-0.5397997221       82     14.1%
-1.136042381        81     13.9%
-0.2416783927       60     10.3%
0.05644293672       52     9.0%
0.3545642661        43     7.4%
0.6526855955        38     6.5%
0.9508069249        33     5.7%
1.547049584         22     3.8%
1.248928254         22     3.8%
Other values (7)    45     7.7%

Smallest values
Value               Count  Frequency (%)
-1.136042381        81     13.9%
-0.8379210515       103    17.7%
-0.5397997221       82     14.1%
-0.2416783927       60     10.3%
0.05644293672       52     9.0%
0.3545642661        43     7.4%
0.6526855955        38     6.5%
0.9508069249        33     5.7%
1.248928254         22     3.8%
1.547049584         22     3.8%

Largest values
Value               Count  Frequency (%)
3.932020219         1      0.2%
3.33577756          1      0.2%
3.037656231         2      0.3%
2.739534901         7      1.2%
2.441413572         6      1.0%
2.143292243         8      1.4%
1.845170913         20     3.4%
1.547049584         22     3.8%
1.248928254         22     3.8%
0.9508069249        33     5.7%

Glucose
Real number (ℝ)

Distinct: 116
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -4.815425167 × 10^-17
Minimum: -2.304010909
Maximum: 2.629880873
Zeros: 0
Zeros (%): 0.0%
Negative: 319
Negative (%): 54.9%
Memory size: 4.7 KiB

Quantile statistics

Minimum: -2.304010909
5-th percentile: -1.40337987
Q1: -0.6985381864
Median: -0.1503279883
Q3: 0.554513695
95-th percentile: 1.964197061
Maximum: 2.629880873
Range: 4.933891783
Interquartile range (IQR): 1.253051881

Descriptive statistics

Standard deviation: 1.000861698
Coefficient of variation (CV): -2.078449281 × 10^16
Kurtosis: -0.08084846299
Mean: -4.815425167 × 10^-17
Median Absolute Deviation (MAD): 0.6265259407
Skewness: 0.5422891697
Sum: -2.842170943 × 10^-14
Variance: 1.001724138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value               Count  Frequency (%)
-0.6202224438       17     2.9%
-0.5027488299       13     2.2%
-0.776853929        13     2.2%
0.3978822098        13     2.2%
-0.5810645725       13     2.2%
-0.1503279883       13     2.2%
-0.2678016022       12     2.1%
-0.3461173447       12     2.1%
-0.111170117        11     1.9%
-0.2286437309       11     1.9%
Other values (106)  453    78.0%

Smallest values
Value               Count  Frequency (%)
-2.304010909        1      0.2%
-2.264853038        1      0.2%
-2.108221553        1      0.2%
-2.069063682        1      0.2%
-1.951590068        1      0.2%
-1.873274325        1      0.2%
-1.834116454        2      0.3%
-1.71664284         3      0.5%
-1.677484969        1      0.2%
-1.638327097        3      0.5%

Largest values
Value               Count  Frequency (%)
2.629880873         1      0.2%
2.590723002         2      0.3%
2.551565131         3      0.5%
2.51240726          5      0.9%
2.473249388         1      0.2%
2.355775774         2      0.3%
2.316617903         1      0.2%
2.277460032         3      0.5%
2.199144289         2      0.3%
2.159986418         1      0.2%

BloodPressure
Real number (ℝ)

Distinct: 35
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -1.964387727 × 10^-16
Minimum: -2.584642613
Maximum: 2.56416979
Zeros: 0
Zeros (%): 0.0%
Negative: 303
Negative (%): 52.2%
Memory size: 4.7 KiB

Quantile statistics

Minimum: -2.584642613
5-th percentile: -1.631158835
Q1: -0.6776750564
Median: -0.1055847893
Q3: 0.6572022334
95-th percentile: 1.801382768
Maximum: 2.56416979
Range: 5.148812403
Interquartile range (IQR): 1.33487729

Descriptive statistics

Standard deviation: 1.000861698
Coefficient of variation (CV): -5.095031312 × 10^15
Kurtosis: -0.2903835923
Mean: -1.964387727 × 10^-16
Median Absolute Deviation (MAD): 0.7627870227
Skewness: 0.03972050431
Sum: -1.172395514 × 10^-13
Variance: 1.001724138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=35)
Most frequent values
Value               Count  Frequency (%)
-0.1055847893       42     7.2%
0.275808722         42     7.2%
0.08511196636       37     6.4%
-0.6776750564       36     6.2%
-0.296281545        35     6.0%
0.6572022334        34     5.9%
0.8478989891        33     5.7%
-0.1908768929       32     5.5%
-1.059068568        32     5.5%
-0.868371812        26     4.5%
Other values (25)   232    39.9%

Smallest values
Value               Count  Frequency (%)
-2.584642613        3      0.5%
-2.393945858        1      0.2%
-2.203249102        4      0.7%
-2.012552346        10     1.7%
-1.82185559         9      1.5%
-1.631158835        11     1.9%
-1.535810457        2      0.3%
-1.440462079        11     1.9%
-1.249765323        17     2.9%
-1.059068568        32     5.5%

Largest values
Value               Count  Frequency (%)
2.56416979          1      0.2%
2.373473035         4      0.7%
2.278124657         1      0.2%
2.182776279         4      0.7%
1.992079523         6      1.0%
1.801382768         14     2.4%
1.610686012         17     2.9%
1.419989256         16     2.8%
1.324640878         4      0.7%
1.2292925           15     2.6%

SkinThickness
Real number (ℝ)

HIGH CORRELATION

Distinct: 39
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -8.866497133 × 10^-17
Minimum: -2.139027761
Maximum: 2.638164328
Zeros: 0
Zeros (%): 0.0%
Negative: 356
Negative (%): 61.3%
Memory size: 4.7 KiB

Quantile statistics

Minimum: -2.139027761
5-th percentile: -1.384734273
Q1: -0.562999614
Median: -0.562999614
Q3: 0.6267150276
95-th percentile: 1.88387084
Maximum: 2.638164328
Range: 4.777192089
Interquartile range (IQR): 1.189714642

Descriptive statistics

Standard deviation: 1.000861698
Coefficient of variation (CV): -1.128812972 × 10^16
Kurtosis: -0.1786351292
Mean: -8.866497133 × 10^-17
Median Absolute Deviation (MAD): 0.444587915
Skewness: 0.6884450763
Sum: -2.842170943 × 10^-14
Variance: 1.001724138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
Most frequent values
Value               Count  Frequency (%)
-0.562999614        195    33.6%
0.8781461902        26     4.5%
0.6267150276        24     4.1%
-0.2532940414       18     3.1%
0.2495682838        17     2.9%
-0.8818719478       16     2.8%
0.7524306089        15     2.6%
-0.7561563665       15     2.6%
0.375283865         15     2.6%
-0.0018628788       14     2.4%
Other values (29)   226    38.9%

Smallest values
Value               Count  Frequency (%)
-2.139027761        2      0.3%
-1.887596598        4      0.7%
-1.761881017        6      1.0%
-1.636165435        6      1.0%
-1.510449854        10     1.7%
-1.384734273        4      0.7%
-1.259018692        12     2.1%
-1.13330311         5      0.9%
-1.007587529        12     2.1%
-0.8818719478       16     2.8%

Largest values
Value               Count  Frequency (%)
2.638164328         6      1.0%
2.512448747         3      0.5%
2.386733166         2      0.3%
2.261017584         3      0.5%
2.135302003         5      0.9%
2.009586422         10     1.7%
1.88387084          14     2.4%
1.758155259         11     1.9%
1.632439678         5      0.9%
1.506724097         10     1.7%

Insulin
Real number (ℝ)

HIGH CORRELATION

Distinct: 110
Distinct (%): 18.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.936591615 × 10^-18
Minimum: -2.319812987
Maximum: 3.117432993
Zeros: 0
Zeros (%): 0.0%
Negative: 432
Negative (%): 74.4%
Memory size: 4.7 KiB

Quantile statistics

Minimum: -2.319812987
5-th percentile: -1.275370256
Q1: -0.3292440173
Median: -0.3292440173
Q3: 0.04554260895
95-th percentile: 2.380179301
Maximum: 3.117432993
Range: 5.437245979
Interquartile range (IQR): 0.3747866262

Descriptive statistics

Standard deviation: 1.000861698
Coefficient of variation (CV): 1.007248498 × 10^17
Kurtosis: 1.763993393
Mean: 9.936591615 × 10^-18
Median Absolute Deviation (MAD): 0
Skewness: 1.296839799
Sum: 1.776356839 × 10^-14
Variance: 1.001724138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value               Count  Frequency (%)
-0.3292440173       321    55.2%
0.4448883588        11     1.9%
0.1069804166        7      1.2%
1.212860955         7      1.2%
0.9056719164        7      1.2%
1.366455474         6      1.0%
1.520049993         6      1.0%
0.2912938397        6      1.0%
2.748806147         5      0.9%
-0.7531488908       5      0.9%
Other values (100)  200    34.4%

Smallest values
Value               Count  Frequency (%)
-2.319812987        1      0.2%
-2.289094083        1      0.2%
-2.227656275        2      0.3%
-2.10478066         1      0.2%
-2.074061756        2      0.3%
-1.889748333        1      0.2%
-1.797591621        1      0.2%
-1.674716006        3      0.5%
-1.643997102        2      0.3%
-1.613278198        1      0.2%

Largest values
Value               Count  Frequency (%)
3.117432993         2      0.3%
3.086714089         1      0.2%
3.055995185         4      0.7%
2.994557377         1      0.2%
2.902400666         1      0.2%
2.871681762         1      0.2%
2.840962858         1      0.2%
2.810243954         2      0.3%
2.748806147         5      0.9%
2.687368339         1      0.2%

BMI
Real number (ℝ)

HIGH CORRELATION

Distinct: 215
Distinct (%): 37.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.62563956 × 10^-16
Minimum: -2.161400024
Maximum: 2.543884128
Zeros: 0
Zeros (%): 0.0%
Negative: 293
Negative (%): 50.4%
Memory size: 4.7 KiB

Quantile statistics

Minimum: -2.161400024
5-th percentile: -1.552674452
Q1: -0.7465243696
Median: -0.02263450007
Q3: 0.618995157
95-th percentile: 1.869350386
Maximum: 2.543884128
Range: 4.705284152
Interquartile range (IQR): 1.365519527

Descriptive statistics

Standard deviation: 1.000861698
Coefficient of variation (CV): 1.779107401 × 10^15
Kurtosis: -0.3175777543
Mean: 5.62563956 × 10^-16
Median Absolute Deviation (MAD): 0.6745337421
Skewness: 0.2578962716
Sum: 3.455014053 × 10^-13
Variance: 1.001724138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value               Count  Frequency (%)
0.1747900098        10     1.7%
0.04317366989       10     1.7%
0.1077607898        9      1.5%
-0.02263450007      9      1.5%
0.1089818398        9      1.5%
-0.08844267003      9      1.5%
-0.2694151374       8      1.4%
0.3722145197        7      1.2%
0.2405981798        7      1.2%
0.3228583922        7      1.2%
Other values (205)  496    85.4%

Smallest values
Value               Count  Frequency (%)
-2.161400024        3      0.5%
-2.128495939        1      0.2%
-2.013331641        1      0.2%
-1.980427556        1      0.2%
-1.963975514        1      0.2%
-1.947523471        2      0.3%
-1.931071429        1      0.2%
-1.881715301        1      0.2%
-1.865263259        1      0.2%
-1.848811216        1      0.2%

Largest values
Value               Count  Frequency (%)
2.543884128         1      0.2%
2.527432086         1      0.2%
2.461623916         1      0.2%
2.445171873         1      0.2%
2.428719831         2      0.3%
2.362911661         1      0.2%
2.346459618         1      0.2%
2.330007576         1      0.2%
2.297103491         3      0.5%
2.280651448         1      0.2%

DiabetesPedigreeFunction
Real number (ℝ)

Distinct: 408
Distinct (%): 70.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.783720974 × 10^-17
Minimum: -1.410612949
Maximum: 3.115961097
Zeros: 0
Zeros (%): 0.0%
Negative: 345
Negative (%): 59.4%
Memory size: 4.7 KiB

Quantile statistics

Minimum: -1.410612949
5-th percentile: -1.167202835
Q1: -0.7401675481
Median: -0.304591555
Q3: 0.6348860772
95-th percentile: 1.988587938
Maximum: 3.115961097
Range: 4.526574046
Interquartile range (IQR): 1.375053625

Descriptive statistics

Standard deviation: 1.000861698
Coefficient of variation (CV): 1.022986756 × 10^16
Kurtosis: 0.1335958208
Mean: 9.783720974 × 10^-17
Median Absolute Deviation (MAD): 0.5978494023
Skewness: 0.9150115222
Sum: 6.75015599 × 10^-14
Variance: 1.001724138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value               Count  Frequency (%)
-0.5992459032       5      0.9%
-0.6291383734       5      0.9%
-0.6590308435       5      0.9%
-0.641949432        5      0.9%
-0.6205976676       4      0.7%
-0.5309202573       4      0.7%
-0.9024409573       4      0.7%
-0.6334087262       4      0.7%
1.211383715         4      0.7%
-0.6376790791       4      0.7%
Other values (398)  537    92.4%

Smallest values
Value               Count  Frequency (%)
-1.410612949        1      0.2%
-1.384990832        1      0.2%
-1.380720479        2      0.3%
-1.367909421        2      0.3%
-1.363639068        1      0.2%
-1.350828009        1      0.2%
-1.333746598        1      0.2%
-1.316665186        1      0.2%
-1.312394833        1      0.2%
-1.30812448         1      0.2%

Largest values
Value               Count  Frequency (%)
3.115961097         1      0.2%
3.107420391         1      0.2%
3.068987215         1      0.2%
3.013472628         1      0.2%
2.95795804          1      0.2%
2.936606276         1      0.2%
2.932335923         1      0.2%
2.770062514         1      0.2%
2.620600163         1      0.2%
2.513841342         1      0.2%

Age
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 45
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.891860487 × 10^-17
Minimum: -1.03469052
Maximum: 3.104071561
Zeros: 12
Zeros (%): 2.1%
Negative: 353
Negative (%): 60.8%
Memory size: 4.7 KiB

Quantile statistics

Minimum: -1.03469052
5-th percentile: -1.03469052
Q1: -0.8465649712
Median: -0.3762510983
Q3: 0.6584394221
95-th percentile: 2.069381041
Maximum: 3.104071561
Range: 4.138762081
Interquartile range (IQR): 1.505004393

Descriptive statistics

Standard deviation: 1.000861698
Coefficient of variation (CV): 2.045973511 × 10^16
Kurtosis: 0.3099974798
Mean: 4.891860487 × 10^-17
Median Absolute Deviation (MAD): 0.5643766475
Skewness: 1.057519908
Sum: 7.105427358 × 10^-15
Variance: 1.001724138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=45)
Most frequent values
Value               Count  Frequency (%)
-0.9406277458       63     10.8%
-1.03469052         54     9.3%
-0.7525021966       37     6.4%
-0.6584394221       34     5.9%
-0.8465649712       31     5.3%
-0.3762510983       29     5.0%
-0.5643766475       26     4.5%
-0.4703138729       26     4.5%
-0.2821883237       20     3.4%
-0.1881255492       17     2.9%
Other values (35)   244    42.0%

Smallest values
Value               Count  Frequency (%)
-1.03469052         54     9.3%
-0.9406277458       63     10.8%
-0.8465649712       31     5.3%
-0.7525021966       37     6.4%
-0.6584394221       34     5.9%
-0.5643766475       26     4.5%
-0.4703138729       26     4.5%
-0.3762510983       29     5.0%
-0.2821883237       20     3.4%
-0.1881255492       17     2.9%

Largest values
Value               Count  Frequency (%)
3.104071561         2      0.3%
3.010008787         1      0.2%
2.915946012         3      0.5%
2.821883237         3      0.5%
2.727820463         2      0.3%
2.633757688         3      0.5%
2.539694914         1      0.2%
2.445632139         3      0.5%
2.351569364         3      0.5%
2.25750659          2      0.3%

Interactions

[Pairwise interaction scatter plots (81 Matplotlib figures); images not reproduced in this text.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
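
As a rough illustration of that recipe, the sketch below ranks both inputs and then divides the covariance of the ranks by the product of their standard deviations. Giving tied values the average of their ranks is an assumption the text does not spell out, the function names are hypothetical, and this is plain Python rather than the code the report generator runs.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

For example, `spearman_rho([1, 2, 3, 4], [1, 8, 27, 64])` evaluates to 1 because the relationship, although nonlinear, is strictly increasing.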

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
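
The calculation just described can be sketched in a few lines of plain Python. The function name `pearson_r` is a hypothetical choice for illustration; the 1/n factors of the covariance and of the two standard deviations cancel, so the sketch works with plain sums.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r_original = pearson_r(x, y)
# Shifting and rescaling x by a positive factor should leave r unchanged.
r_rescaled = pearson_r([3.0 * a + 5.0 for a in x], y)
```

Comparing `r_original` with `r_rescaled` illustrates the invariance under changes of location and scale mentioned above.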

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
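
A minimal sketch of that pair-counting definition follows. It implements the formula exactly as worded above (often called tau-a), in which tied pairs count as neither concordant nor discordant; statistical libraries commonly apply a tie-corrected variant instead, and the report does not state which one it used. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)
```

With `x = [1, 2, 3]` and `y = [1, 3, 2]` there are two concordant pairs and one discordant pair out of three, so the function returns 1/3.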

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

     level_0    Pregnancies  Glucose    BloodPressure  SkinThickness  Insulin    BMI        DiabetesPedigreeFunction  Age
0    -1.722625   0.652686     1.298513   0.085112       1.255293      -0.329244   0.372215   0.933811                  1.693130
1    -1.718069  -0.837921    -1.168433  -0.486978       0.500999      -0.329244  -0.779428  -0.244807                 -0.094063
2    -1.708959  -0.837921    -1.011801  -0.486978      -0.253294       0.106980  -0.532648  -1.030552                 -1.034691
3    -1.699848   0.354564     0.045461   0.275809      -0.563000      -0.329244  -0.943949  -0.885360                 -0.188126
4    -1.695293  -0.241678    -1.442538  -2.012552       0.878146      -0.077333  -0.055539  -0.684653                 -0.564377
5    -1.690738   1.845171     0.006303  -0.190877      -0.563000      -0.329244   0.651899  -1.171473                 -0.282188
6    -1.686182   1.248928     0.397882   2.373473      -0.563000      -0.329244   0.107761  -0.752979                  2.069381
7    -1.681627   0.056443    -0.189486   1.992080      -0.563000      -0.329244   1.030296  -0.928063                 -0.188126
8    -1.677072   1.845171     2.081671   0.275809      -0.563000      -0.329244   1.096104   0.549479                  0.188126
9    -1.667961   0.354564     2.003355   0.085112      -0.756156       2.595212  -0.911045   0.762997                  1.787193

Last rows

     level_0    Pregnancies  Glucose    BloodPressure  SkinThickness  Insulin    BMI        DiabetesPedigreeFunction  Age
571   1.684740   0.950807     0.867777   1.801383       2.009586      -0.329244   0.108982  -0.073992                  0.658439
572   1.689295  -1.136042     0.319566   0.085112      -0.563000      -0.329244   0.816420  -0.641949                  1.881255
573   1.693850  -0.837921    -0.346117   0.466505      -0.563000      -0.329244   1.013844  -0.902441                 -0.564377
574   1.702961  -0.539800    -1.050959  -1.249765       0.123853      -2.289094  -0.483292   1.527390                 -0.940628
575   1.707516   1.547050     2.159986   0.275809       0.752431      -0.329244   2.083227  -0.022748                  1.034691
576   1.712072   1.547050    -1.011801  -0.868372      -0.563000      -0.329244  -1.453962  -1.137310                  0.094063
577   1.721182  -0.539800     0.280409  -0.105585       0.249568      -0.329244   0.898680  -0.291780                 -0.470314
578   1.725737   0.354564     0.241251   0.085112      -0.253294       0.659921  -0.845237  -0.697464                 -0.188126
579   1.730293  -0.837921     0.437040  -1.059069      -0.563000      -0.329244  -0.203607  -0.253347                  1.410942
580   1.734848  -0.837921    -0.855170  -0.105585       0.752431      -0.329244  -0.154251  -0.398539                 -0.846565